- Review data visualization principles
- Look at applications in education data
- Challenges in an LEA/SEA
- Best practices and advice
- What tools to use
Jared Knowles
Policy Research Advisor, Wisconsin DPI
qplot(hp, mpg, data = mtcars) + theme_dpi()
All charts need a few things. There are sometimes strong reasons to deviate from them, but they are good rules of thumb.
How you turn dimensions in the data into visual cues for your audience is everything.
| Level of measurement | Permissible statistics |
|---|---|
| Nominal | mode, chi-squared |
| Ordinal | median, percentile |
| Interval | mean, standard deviation, correlation, ANOVA |
| Ratio | geometric mean, harmonic mean, logarithms |
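
A minimal sketch in base R of the statistics listed for each row of the table above. The choice of the built-in `mtcars` and `airquality` datasets, and which of their columns stand in for each level, are assumptions made only for illustration.

```r
# Nominal: unordered categories -> mode, chi-squared test
cyl <- factor(mtcars$cyl)
am  <- factor(mtcars$am, labels = c("automatic", "manual"))
cyl_mode <- names(which.max(table(cyl)))              # most frequent category
chi <- suppressWarnings(chisq.test(table(cyl, am)))   # association of two nominal variables

# Ordinal: ordered categories -> median and percentiles are meaningful
gear <- factor(mtcars$gear, ordered = TRUE)
# type = 1 returns an observed value, so the result is always a real category
gear_median <- levels(gear)[quantile(as.integer(gear), 0.5, type = 1)]

# Interval: differences are meaningful -> mean, standard deviation, correlation
temp_mean <- mean(airquality$Temp)                    # degrees Fahrenheit
temp_sd   <- sd(airquality$Temp)
temp_wind <- cor(airquality$Temp, airquality$Wind)

# Last row (positive values with a true zero): ratios and logs are meaningful
mpg        <- mtcars$mpg
arithmetic <- mean(mpg)
geometric  <- exp(mean(log(mpg)))
harmonic   <- 1 / mean(1 / mpg)
```

Each level inherits the statistics of the levels above it, so the median is also valid for interval data, but not the other way around.
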
How do we map levels of measurement onto visual features of charts?
| Aesthetic | Discrete | Continuous |
|---|---|---|
| Color | Disparate colors | Sequential or divergent colors |
| Size | Unique size for each value | Size (e.g., radius) mapped to the value |
| Shape | A shape for each value | does not make sense |

| Aesthetic | Ordered | Unordered |
|---|---|---|
| Color | Sequential or divergent colors | Rainbow |
| Size | Increasing or decreasing radius | does not make sense |
| Shape | does not make sense | A shape for each value |
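
The mappings in the two tables above could be sketched in ggplot2 roughly as follows. The datasets (`mtcars` and `diamonds`), the variables chosen for each aesthetic, and the color choices are assumptions for illustration, not the plots from the original slides.

```r
library(ggplot2)

# Discrete, unordered variables -> disparate colors and one shape per value
p_discrete <- ggplot(mtcars, aes(x = hp, y = mpg,
                                 color = factor(cyl), shape = factor(am))) +
  geom_point(size = 3) +
  labs(color = "Cylinders", shape = "Transmission (1 = manual)")

# Continuous variables -> sequential color gradient and point size
# (scale_size() scales point area; scale_radius() would scale the radius)
p_continuous <- ggplot(mtcars, aes(x = hp, y = mpg, color = wt, size = qsec)) +
  geom_point() +
  scale_color_gradient(low = "grey80", high = "darkblue") +
  scale_size(range = c(1, 6))

# Ordered discrete variable -> sequential palette, light to dark
# (every 50th row only, to keep the example plot light)
diamonds_sample <- diamonds[seq(1, nrow(diamonds), by = 50), ]
p_ordered <- ggplot(diamonds_sample, aes(x = carat, y = price, color = cut)) +
  geom_point() +
  scale_color_brewer(palette = "Blues")
```

Printing any of the three objects (for example `print(p_ordered)`) draws the plot.
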
Think like a map. Data density and easy interpretability.
How do we display a ton of data, tens or hundreds of thousands of observations, with combinations of data types?
Let's look at some examples of this.
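
As one possible illustration, here are three common ways to keep a plot readable at this scale: transparency, 2-D binning, and small multiples. The `diamonds` dataset that ships with ggplot2 is used as a stand-in because it has roughly 54,000 rows with both continuous and ordered variables; it is an assumption, not the education data discussed here.

```r
library(ggplot2)

n_obs <- nrow(diamonds)  # tens of thousands of observations

# Option 1: small, transparent points so dense regions appear darker
p_alpha <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.05, size = 0.5)

# Option 2: bin observations into 2-D cells and map the count to color
p_bin <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_bin2d(bins = 60) +
  scale_fill_gradient(low = "grey90", high = "darkblue")

# Option 3: small multiples add a discrete dimension without more clutter
p_facet <- p_bin + facet_wrap(~ cut)
```

Binning trades individual points for density, which is usually the right trade once points overlap heavily.
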
Here is a simple plot of mean school reading scores:
But what's wrong with this plot?
With the same space, what additional information are we providing?
How can we do even better?
We still aren't sure what the mean scale score means. Let's see a couple more additions that can make this useful.
Sometimes, we can get away with showing the raw data, that is, all data points. We may want to do this for a few reasons:
How could it be done?
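
One way it might be done, sketched with ggplot2: jitter the points horizontally so they do not stack, make them semi-transparent, and overlay a summary so the raw data and the aggregate appear together. The `mpg` dataset bundled with ggplot2 and the variables chosen are assumptions for illustration.

```r
library(ggplot2)

p_raw <- ggplot(mpg, aes(x = class, y = hwy)) +
  # Every observation is drawn; jitter only sideways so y values stay exact
  geom_jitter(width = 0.2, height = 0, alpha = 0.4) +
  # Overlay the group median so the summary sits on top of the raw data
  stat_summary(fun = median, geom = "point", color = "red", size = 3)

# The first layer keeps one row per observation
raw_layer <- layer_data(p_raw, 1)
```
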
All models are wrong. Some models are useful.
Trees are ways to divide up the variation in a dataset and rank the explanatory values.
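
A minimal sketch of this idea using the `rpart` package, which is distributed with R. The `mtcars` data and the model formula are assumptions for illustration; the original slides may have used a different package or dataset.

```r
library(rpart)

# Regression tree: each split divides the remaining variation in mpg
fit <- rpart(mpg ~ ., data = mtcars, method = "anova")

# Explanatory variables ranked by how much variation their splits account for
importance <- sort(fit$variable.importance, decreasing = TRUE)
print(importance)

# Draw the tree with the number of observations in each node
plot(fit, margin = 0.1)
text(fit, use.n = TRUE)
```

With only 32 rows the tree is shallow; on a larger dataset the same calls produce deeper trees and a longer ranking.
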
We can combine these features.
They can communicate, confound, brand, and distract
We have a number of other techniques we can use beyond simple charts.
The technology you choose to do visualizations is largely a question of personal productivity, but with some important caveats:
Visualize some education data. Imagine we have the following dimensions and want to present more of them on a plot like that on the right.